part-of-speech tagger

Terms from Artificial Intelligence: humans at the heart of algorithms

Page numbers are for draft copy at present; they will be replaced with correct numbers when final book is formatted. Chapter numbers are correct and will not change now.

A part-of-speech (POS) tagger takes text and annotates each word (or sometimes group of words) with a tag representing the part of speech it is being used as, for example 'noun phrase' or 'adjective'. It is often used as an early stage of natural language processing, before syntactic analysis and further stages. Note that the same word may be used in different ways in different sentences: for example, 'going' is a verb in "I am going to the shops", but a noun in "his going will be rued". This can sometimes help in disambiguating homonyms, such as 'bow' the action (typically used as a verb) vs the weapon (used as a noun). However, this does not resolve all ambiguity: for example, the difference between 'bow' the weapon and 'bow' the front of a ship requires semantic disambiguation, as both are nouns. Typically a POS tagger is followed by higher levels of parsing to create a grammar tree or other representation, but it may be used on its own in simple applications.
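The 'going' example can be illustrated with a minimal sketch of a rule-based tagger. This is a toy, not how CLAWS or any production tagger works: the lexicon, the tag names (DET, AUX, and so on) and the two context rules are all assumptions made up for illustration, and real taggers use much larger tag sets and statistical or neural models.

```python
# Toy lexicon (hypothetical): each word maps to its possible tags,
# with the default choice listed first.
LEXICON = {
    "i": ["PRON"],
    "am": ["AUX"],
    "going": ["VERB", "NOUN"],
    "to": ["PREP"],
    "the": ["DET"],
    "shops": ["NOUN", "VERB"],
    "his": ["DET"],
    "will": ["AUX", "NOUN"],
    "be": ["AUX"],
    "rued": ["VERB"],
}


def tag(sentence):
    """Return a list of (word, tag) pairs for a whitespace-separated sentence."""
    tagged = []
    prev_tag = None
    for word in sentence.lower().split():
        # Unknown words are guessed to be nouns (an arbitrary assumption).
        candidates = LEXICON.get(word, ["NOUN"])
        choice = candidates[0]
        # Context rule 1: after a determiner, prefer the noun reading.
        if prev_tag == "DET" and "NOUN" in candidates:
            choice = "NOUN"
        # Context rule 2: after an auxiliary, prefer the verb reading.
        elif prev_tag == "AUX" and "VERB" in candidates:
            choice = "VERB"
        tagged.append((word, choice))
        prev_tag = choice
    return tagged


print(tag("I am going to the shops"))
print(tag("his going will be rued"))
```

With this lexicon, 'going' is tagged VERB in the first sentence (it follows the auxiliary 'am') and NOUN in the second (it follows the determiner 'his'). Two senses that share a part of speech, such as 'bow' the weapon and 'bow' of a ship, would receive the same tag, which is why later semantic stages are still needed.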

Used in Chap. 13: page 212

Also known as POS tagger

Example output of CLAWS WWW tagger with three meanings for ‘base’. Note that the tagger copes with the typing error ‘by’